Smart Paper Notes

Few-shot Classification with Hypersphere Modeling of Prototypes

Ning Ding, Yulin Chen, Ganqu Cui, Xiaobin Wang, Hai-Tao Zheng, Zhiyuan Liu, Pengjun Xie

Categories: Machine Learning | Natural Language Processing | Computer Vision

2022-11-10

Metric-based meta-learning is one of the de facto standards in few-shot learning. It consists of representation learning and metric calculation designs. Previous works construct class representations in different ways, varying from mean output embeddings to covariances and distributions. However, representing a class by a point embedding lacks expressivity and cannot capture class information robustly, while complex statistical modeling makes metric design difficult. In this work, we use tensor fields ("areas") to model classes from a geometrical perspective for few-shot learning. We present a simple and effective method, dubbed hypersphere prototypes (HyperProto), where class information is represented by hyperspheres of dynamic size with two sets of learnable parameters: the hypersphere's center and its radius. Extending from points to areas, hyperspheres are much more expressive than embeddings. Moreover, it is more convenient to perform metric-based classification with hypersphere prototypes than with statistical modeling, as we only need to calculate the distance from a data point to the surface of the hypersphere. Following this idea, we also develop two variants of prototypes under other measurements. Extensive experiments and analysis on few-shot learning tasks across NLP and CV, and comparison with 20+ competitive baselines, demonstrate the effectiveness of our approach.
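The point-to-surface metric described in this abstract can be illustrated with a minimal sketch. It assumes a Euclidean metric and a signed distance (negative inside the sphere); the function names, the fixed centers and radii, and the toy labels are hypothetical and stand in for parameters that the paper learns.

```python
import math

def distance_to_surface(point, center, radius):
    """Signed Euclidean distance from a point to a hypersphere's surface.

    Assumption: distance to the center minus the radius, so the value is
    negative when the point lies inside the hypersphere.
    """
    return math.dist(point, center) - radius

def classify(point, prototypes):
    """Return the label whose hypersphere surface is nearest to the point."""
    return min(
        prototypes,
        key=lambda label: distance_to_surface(point, *prototypes[label]),
    )

# Hypothetical prototypes: label -> (center, radius).
prototypes = {
    "cat": ((0.0, 0.0), 1.0),
    "dog": ((4.0, 0.0), 2.5),
}

# The query is equally far from both centers, so the radius decides.
query = (2.0, 0.0)
print(classify(query, prototypes))
```

With point prototypes alone this query would be a tie; the larger radius of the second class breaks it, which is the extra expressivity the abstract attributes to hyperspheres.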

Incorporating Casual Analysis into Diversified and Logical Response Generation

Jiayi Liu, Wei Wei, Zhixuan Chu, Xing Gao, Ji Zhang, Tan Yan, Yulin Kang

Categories: Natural Language Processing | Artificial Intelligence

2022-09-20

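Although the conditional variational autoencoder (CVAE) model can generate more diversified responses than the traditional Seq2Seq model, the responses often have low relevance to the input words or are illogical with respect to the question. A causal analysis is carried out to study the underlying reasons, and a method is provided for finding mediators and mitigating the confounding bias in dialogues. Specifically, we propose to predict the mediators to preserve relevant information and to automatically incorporate the mediators into the generation process. In addition, a dynamic topic graph guided conditional variational autoencoder (TGG-CVAE) model is used to complement the semantic space and reduce the confounding bias in responses. Extensive experiments show that the proposed model generates both relevant and informative responses, and outperforms the state of the art in terms of automatic metrics and human evaluation.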

Making the Best of Both Worlds: A Domain-Oriented Transformer for Unsupervised Domain Adaptation

Wenxuan Ma, Jinming Zhang, Shuang Li, Chi Harold Liu, Yulin Wang, Wei Li

Categories: Computer Vision

2022-08-02

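Extensive studies on unsupervised domain adaptation (UDA) have propelled the deployment of deep learning from limited experimental datasets into unconstrained real-world domains. Most UDA approaches align features within a common embedding space and apply a shared classifier for target prediction. However, since a perfectly aligned feature space may not exist when the domain discrepancy is large, these methods suffer from two limitations. First, forced domain alignment deteriorates target-domain discriminability because target label supervision is lacking. Second, the source-supervised classifier is inevitably biased toward the source data, so it may underperform in the target domain. To alleviate these issues, we propose to conduct feature alignment simultaneously in two individual spaces, each focusing on a different domain, and to create for each space a domain-oriented classifier tailored to that domain. Specifically, we design a Domain-Oriented Transformer (DOT) that has two individual classification tokens to learn different domain-oriented representations, and two classifiers to preserve domain-wise discriminability. A theoretically guaranteed contrastive-based alignment and a source-guided pseudo-label refinement strategy are used to explore both domain-invariant and domain-specific information. Comprehensive experiments validate that our method achieves state-of-the-art results on several benchmarks.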

Learning-based Autonomous Channel Access in the Presence of Hidden Terminals

Yulin Shao, Yucheng Cai, Taotao Wang, Ziyang Guo, Peng Liu, Jiajun Luo, Deniz Gunduz

Categories: Machine Learning

2022-07-07

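We consider the problem of autonomous channel access (AutoCA), where a group of terminals tries to discover, in a distributed fashion, a communication strategy with an access point (AP) over a common wireless channel. Due to the irregular topology and the limited communication range of the terminals, a practical challenge for AutoCA is the hidden terminal problem, which is notorious in wireless networks for deteriorating throughput and delay performance. To meet this challenge, this paper presents a new multi-agent deep reinforcement learning paradigm, dubbed MADRL-HT, tailored for AutoCA in the presence of hidden terminals. MADRL-HT exploits topological insights and transforms the observation space of each terminal into a scalable form that is independent of the number of terminals. To compensate for partial observability, we propose a look-back mechanism so that the terminals can infer the behavior of their hidden terminals from the carrier-sensed channel states as well as from the AP's feedback. A window-based global reward function is proposed, which instructs the terminals to maximize the system throughput while balancing the terminals' transmission opportunities over the course of learning. Extensive numerical experiments verify the superior performance of our solution, benchmarked against the legacy carrier-sense multiple access with collision avoidance (CSMA/CA) protocol.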

Cryptocurrency Valuation: An Explainable AI Approach

Yulin Liu, Luyao Zhang

Categories: Artificial Intelligence | Machine Learning (Statistics)

2022-01-30

Currently, there are no convincing proxies for the fundamentals of cryptocurrency assets. We propose a new market-to-fundamental ratio, the price-to-utility (PU) ratio, utilizing unique blockchain accounting methods. We then proxy various fundamental-to-market ratios using Bitcoin historical data and find that they have little predictive power for short-term Bitcoin returns. However, the PU ratio effectively predicts long-term Bitcoin returns. We verify the PU ratio valuation with unsupervised and supervised machine learning. The valuation method informs investment returns and predicts bull markets effectively. Finally, we present an automated trading strategy advised by the PU ratio that outperforms the conventional buy-and-hold and market-timing strategies. We distribute the trading algorithms as open-source software via the Python Package Index for future research.

Exploiting Both Domain-specific and Invariant Knowledge via a Win-win Transformer for Unsupervised Domain Adaptation

Wenxuan Ma, Jinming Zhang, Shuang Li, Chi Harold Liu, Yulin Wang, Wei Li

Categories: Computer Vision

2021-11-25

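Unsupervised domain adaptation (UDA) aims to transfer knowledge from a labeled source domain to an unlabeled target domain. Most existing UDA approaches enable knowledge transfer by learning domain-invariant representations and sharing one classifier across the two domains. However, ignoring task-related domain-specific information and forcing a unified classifier to fit both domains limits the feature expressiveness in each domain. In this paper, observing that a Transformer architecture with a comparable number of parameters can generate more transferable representations than its CNN counterparts, we propose a Win-Win Transformer framework (WinTR) that separately explores the domain-specific knowledge of each domain while exchanging cross-domain knowledge. Specifically, we learn two different mappings using two individual classification tokens in the Transformer, and design a domain-specific classifier for each of them. Cross-domain knowledge is transferred via source-guided label refinement and single-sided feature alignment with respect to the source or the target, which preserves the integrity of domain-specific information. Extensive experiments on three benchmark datasets show that our method outperforms state-of-the-art UDA methods, validating the effectiveness of exploiting both domain-specific and invariant knowledge.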

OpenPrompt: An Open-source Framework for Prompt-learning

Ning Ding, Shengding Hu, Weilin Zhao, Yulin Chen, Zhiyuan Liu, Hai-Tao Zheng, Maosong Sun

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2021-11-03

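Prompt-learning has become a new paradigm in modern natural language processing. It directly adapts pre-trained language models (PLMs) to cloze-style prediction, autoregressive modeling, or sequence-to-sequence generation, resulting in promising performance on various tasks. However, no standard implementation framework for prompt-learning has been proposed yet, and most existing prompt-learning codebases, which are often unregulated, provide only limited implementations for specific scenarios. Since many details need to be considered in prompt-learning, such as the templating strategy, the initialization strategy, and the verbalizing strategy, practitioners face obstacles in quickly adapting the desired prompt-learning methods to their applications. In this paper, we present OpenPrompt, a unified, easy-to-use toolkit for conducting prompt-learning over PLMs. OpenPrompt is a research-friendly framework equipped with efficiency, modularity, and extensibility, and its combinability allows different PLMs, task formats, and prompting modules to be freely combined in a unified paradigm. Users can conveniently deploy prompt-learning frameworks and evaluate their generalization on different NLP tasks without constraints. OpenPrompt is publicly released at https://github.com/thunlp/openprompt.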

StrucTexT: Structured Text Understanding with Multi-Modal Transformers

Yulin Li, Yuxi Qian, Yuchen Yu, Xiameng Qin, Chengquan Zhang, Yan Liu, Kun Yao, Junyu Han, Jingtuo Liu, Errui Ding

Categories: Computer Vision | Natural Language Processing

2021-08-06

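Structured text understanding on visually rich documents (VRDs) is a crucial part of document intelligence. Due to the complexity of content and layout in VRDs, structured text understanding is a challenging task. Most existing studies decouple this problem into two sub-tasks, entity labeling and entity linking, which require a holistic understanding of the document context at both the token and segment levels. However, little work has addressed solutions that efficiently extract structured data from different levels. This paper proposes a unified framework named StrucTexT, which is flexible and effective for handling both sub-tasks. Specifically, based on the transformer, we introduce a segment-token aligned encoder to handle the entity labeling and entity linking tasks at different levels of granularity. Moreover, we design a novel pre-training strategy with three self-supervised tasks to learn a richer representation. StrucTexT uses the existing Masked Visual Language Modeling task and the new Sentence Length Prediction and Paired Boxes Direction tasks to incorporate multi-modal information across text, image, and layout. We evaluate our method for structured text understanding at the segment level and the token level, and show that it outperforms state-of-the-art counterparts with significantly superior performance on the FUNSD, SROIE, and EPHOIE datasets.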

Deep Model Assembling

Zanlin Ni, Yulin Wang, Jiangwei Yu, Haojun Jiang, Yue Cao, Gao Huang

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2022-12-08

Large deep learning models have achieved remarkable success in many scenarios. However, training large models is usually challenging, e.g., due to the high computational cost, the unstable and painfully slow optimization procedure, and the vulnerability to overfitting. To alleviate these problems, this work studies a divide-and-conquer strategy, i.e., dividing a large model into smaller modules, training them independently, and reassembling the trained modules to obtain the target model. This approach is promising since it avoids directly training large models from scratch. Nevertheless, implementing this idea is non-trivial, as it is difficult to ensure the compatibility of the independently trained modules. In this paper, we present an elegant solution to address this issue, i.e., we introduce a global, shared meta model to implicitly link all the modules together. This enables us to train highly compatible modules that collaborate effectively when they are assembled together. We further propose a module incubation mechanism that enables the meta model to be designed as an extremely shallow network. As a result, the additional overhead introduced by the meta model is minimized. Though conceptually simple, our method significantly outperforms end-to-end (E2E) training in terms of both final accuracy and training efficiency. For example, on top of ViT-Huge, it improves the accuracy by 2.7% compared to the E2E baseline on ImageNet-1K, while reducing the training cost by 43%. Code is available at https://github.com/LeapLabTHU/Model-Assembling.

AdaFocusV3: On Unified Spatial-temporal Dynamic Video Recognition

Yulin Wang, Yang Yue, Xinhong Xu, Ali Hassani, Victor Kulikov, Nikita Orlov, Shiji Song, Humphrey Shi, Gao Huang

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2022-09-27

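Recent research has shown that reducing temporal redundancy and reducing spatial redundancy are both effective approaches to efficient video recognition, e.g., allocating the majority of computation to task-relevant frames or to the most valuable image regions of each frame. However, in most existing works, either type of redundancy is typically modeled with the other absent. This paper explores a unified formulation of spatial-temporal dynamic computation on top of the recently proposed AdaFocusV2 algorithm, contributing to an improved AdaFocusV3 framework. Our method reduces the computational cost by activating the expensive high-capacity network only on some small but informative 3D video cubes. These cubes are cropped from the space formed by frame height, width, and video duration, while their locations are adaptively determined by a lightweight policy network on a per-sample basis. At test time, the number of cubes corresponding to each video is dynamically configured, i.e., video cubes are processed sequentially until a sufficiently reliable prediction is produced. Notably, AdaFocusV3 can be trained effectively by approximating the non-differentiable cropping operation with the interpolation of deep features. Extensive empirical results on six benchmark datasets (i.e., ActivityNet, FCVID, Mini-Kinetics, Something-Something V1 & V2, and Diving48) demonstrate that our model is considerably more efficient than competitive baselines.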
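The test-time behavior described in this abstract, processing cubes one after another until the prediction is reliable enough, can be sketched as an early-exit loop. This is only an illustration under stated assumptions: the running mean over per-cube class probabilities, the confidence threshold, and the function name are hypothetical choices, not details taken from the abstract.

```python
def predict_with_early_exit(cube_probs, threshold=0.8):
    """Consume per-cube class probabilities in order and stop early.

    Assumptions: predictions are aggregated with a running mean, and the
    loop exits once the top averaged probability reaches `threshold`.
    Returns (predicted_class_index, number_of_cubes_used).
    """
    running = None
    used = 0
    best = None
    for probs in cube_probs:
        used += 1
        if running is None:
            running = list(probs)
        else:
            # Incremental mean update over the cubes seen so far.
            running = [r + (p - r) / used for r, p in zip(running, probs)]
        best = max(range(len(running)), key=running.__getitem__)
        if running[best] >= threshold:
            break  # Confident enough: skip the remaining cubes.
    if running is None:
        raise ValueError("at least one cube is required")
    return best, used

# Hypothetical per-cube outputs for a two-class problem.
cube_probs = [[0.60, 0.40], [0.95, 0.05], [0.99, 0.01], [0.50, 0.50]]
label, cubes_used = predict_with_early_exit(cube_probs)
print(label, cubes_used)
```

In this toy input the averaged confidence for class 0 passes 0.8 after the third cube, so the fourth cube is never evaluated. That skipped work is where the computational saving of a dynamically configured cube count would come from.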
